Adel Bibi

$D^2$-Monitor: Dynamic Safety Monitoring for Diffusion LLMs via Hesitation-Aware Routing

May 25, 2026
Via arXiv

OMNI-LEAK: Orchestrator Multi-Agent Network Induced Data Leakage

Feb 13, 2026
Via arXiv

Reliable and Responsible Foundation Models: A Comprehensive Survey

Feb 04, 2026
Via arXiv

A Fragile Guardrail: Diffusion LLM's Safety Blessing and Its Failure Mode

Jan 30, 2026
Via arXiv

The Alignment Curse: Cross-Modality Jailbreak Transfer in Omni-Models

Jan 30, 2026
Via arXiv

It's a TRAP! Task-Redirecting Agent Persuasion Benchmark for Web Agents

Dec 29, 2025
Via arXiv

Unforgotten Safety: Preserving Safety Alignment of Large Language Models with Continual Learning

Dec 10, 2025
Via arXiv

Rethinking Safety in LLM Fine-tuning: An Optimization Perspective

Aug 17, 2025
Via arXiv

Attacking Multimodal OS Agents with Malicious Image Patches

Mar 13, 2025
Via arXiv

Shh, don't say that! Domain Certification in LLMs

Feb 26, 2025
Via arXiv